fix: race condition in shared runtime services (#37825)#37852
fix: race condition in shared runtime services (#37825)#37852ormsbee merged 1 commit intoopenedx:release/ulmofrom
Conversation
There is a singleton SplitMongoModuleStore instance that is returned whenever we call the ubiquitous modulestore() function (wrapped in a MixedModuleStore). During initialization, SplitMongoModuleStore sets up a small handful of XBlock runtime services that are intended to be shared globally: i18n, fs, cache. When we get an individual block back from the store using get_item(), SplitMongoModuleStore creates a SplitModuleStoreRuntime using SplitMongoModuleStore.create_runtime(). These runtimes are intended to be modified on a per-item, and later per-user basis (using prepare_runtime_for_user()). Prior to this commit, the create_runtime() method was assigning the globally shared SplitMongoModuleStore.services dict directly to the newly instantiated SplitModuleStoreRuntime. This meant that even though each block had its own _services dict, they were all in fact pointing to the same underlying object. This exposed us to a risk of multiple threads contaminating each other's SplitModuleStoreRuntime services when deployed under load in multithreaded mode. We believe this led to a race condition that caused student submissions to be mis-scored in some cases. This commit makes a copy of the SplitMongoModuleStore.services dict for each SplitModuleStoreRuntime. The baseline global services are still shared, but other per-item and per-user services are now better isolated from each other. This commit also includes a small modification to the PartitionService, which up until this point had relied on the (incorrect) shared instance behavior. The details are provided in the comments in the PartitionService __init__(). It's worth noting that the historical rationale for having a singleton ModuleStore instance is that the ModuleStore used to be extremely expensive to initialize. This was because at one point, the init process required reading entire XML-based courses into memory, or pre-computing complex field inheritance caches. This is no longer the case, and SplitMongoModuleStore initialization is in the 1-2 ms range, with most of that being for PyMongo's connection setup. We should try to fully remove the global singleton in the Verawood release cycle in order to make this kind of bug less likely.
|
Thanks for the pull request, @marslanabdulrauf! This repository is currently maintained by Once you've gone through the following steps feel free to tag them in a comment and let them know that your changes are ready for engineering review. 🔘 Get product approvalIf you haven't already, check this list to see if your contribution needs to go through the product review process.
🔘 Provide contextTo help your reviewers and other members of the community understand the purpose and larger context of your changes, feel free to add as much of the following information to the PR description as you can:
🔘 Get a green buildIf one or more checks are failing, continue working on your changes until this is no longer the case and your build turns green. DetailsWhere can I find more information?If you'd like to get more details on all aspects of the review process for open source pull requests (OSPRs), check out the following resources: When can I expect my changes to be merged?Our goal is to get community contributions seen and reviewed as efficiently as possible. However, the amount of time that it takes to review and merge a PR can vary significantly based on factors such as:
💡 As a result it may take up to several weeks or months to complete a review and merge your PR. |
Description
This is Backport PR of: #37825